SEQUENCE LISTING


(1) GENERAL INFORMATION: 


(i) APPLICANT: GIJZEN, Mark 
(ii) TITLE OF INVENTION: SEED COAT SPECIFIC DNA REGULATORY REGION AND PEROXIDASE


(iii) NUMBER OF SEQUENCES: 2 


(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: NIXON & VANDERHYE P.C.

(B) STREET: 8th Floor, 1100 North Glebe Road 

(C) CITY: Arlington 

(D) STATE: Virginia 

(E) COUNTRY: United States 

(F) ZIP: 22201-4714 


(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30


(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 26-SEP-199 

(C) CLASSIFICATION: 


(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/723,414 

(B) FILING DATE: 30-SEP-1996 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: BYRNE, Thomas E. 

(B) REGISTRATION NUMBER: 32,205 

(C) REFERENCE/DOCKET NUMBER: 76-105 


(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (703) 816-4021 

(B) TELEFAX: (703) 816-4100 


(2) INFORMATION FOR SEQ ID NO: 1: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1244 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1056

(ix) FEATURE: 

(A) NAME/KEY: sig_peptide 

(B) LOCATION: 1..11


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG GGT TCC ATG CGT CTA TTA GTA GTG GCA TTG TTG TGT GCA TTT GCT 48
Met Gly Ser Met Arg Leu Leu Val Val Ala Leu Leu Cys Ala Phe Ala
1 5 10 15

ATG CAT GCA GGT TTT TCA GTC TCT TAT GCT CAG CTT ACT CCT ACG TTC 96
Met His Ala Gly Phe Ser Val Ser Tyr Ala Gln Leu Thr Pro Thr Phe
20 25 30


TAC AGA GAA ACA TGT CCA AAT CTG TTC CCT ATT GTG TTT GGA GTA ATC 144
Tyr Arg Glu Thr Cys Pro Asn Leu Phe Pro Ile Val Phe Gly Val Ile
35 40 45

TTC GAT GCT TCT TTG ACC GAT CCC CGA ATC GGG GCC AGT CTC ATG AGG 192
Phe Asp Ala Ser Phe Thr Asp Pro Arg Ile Gly Ala Ser Leu Met Arg
50 55 60


CTT CAT TTT CAT GAT TGC TTT GTT CAA GGT TGT GAT GGA TCA GTT TTG 240
Leu His Phe His Asp Cys Phe Val Gln Gly Cys Asp Gly Ser Val Leu
65 70 75

CTG AAC AAC ACT GAT ACA ATA GAA AGC GAG CAA GAT GCA CTT CCA AAT
85 90 95

ATC AAC TCA ATA AGA GGA TTG GAC GTT GTC AAT GAC ATC AAG ACA GCG 336
Ile Asn Ser Ile Arg Gly Leu Asp Val Val Asn Asp Ile Lys Thr Ala
100

384
Val Glu Asn Ser Cys Pro Asp Thr Val Ser Cys Ala Asp Ile Leu Ala
115 120 125

432
Ile Ala Ala Glu Ile Ala Ser Val Leu Gly Gly Gly Pro Gly Trp Pro
130

480
Val Pro Leu Gly Arg Arg Asp Ser Leu Thr Ala Asn Arg Thr Leu Ala
145 150 160

CAA AAC CTT CCA GCA CCT TTC TTC AAC CTC ACT CAA CTT AAA GCT 528
Asn Gln Asn Leu Pro Ala Pro Phe Phe Asn Leu Thr Gln Leu Lys Ala
165 170 175





TCC TTT GCT GTT CAA GGT CTC AAC ACC CTT GAT TTA GTT ACA CTC TCA 576
Ser Phe Ala Val Gln Gly Leu Asn Thr Leu Asp Leu Val Thr Leu Ser
180 185 190

GGT GGT CAT ACG TTT GGA AGA GCT CGG TGC AGT ACA TTC ATA AAC CGA 624
Gly Gly His Thr Phe Gly Arg Ala Arg Cys Ser Thr Phe Ile Asn Arg
195 200 205

TTA TAC AAC TTC AGC AAC ACT GGA AAC CCT GAT CCA ACT CTG AAC ACA 672
Leu Tyr Asn Phe Ser Asn Thr Gly Asn Pro Asp Pro Thr Leu Asn Thr
210 215 220

ACA TAC TTA GAA GTA TTG CGT GCA AGA TGC CCC CAG AAT GCA ACT GGG 720
Thr Tyr Leu Glu Val Leu Arg Ala Arg Cys Pro Gln Asn Ala Thr Gly
225 230 235 240

GAT AAC CTC ACC AAT TTG GAC CTG AGC ACA CCT GAT CAA TTT GAC AAC
Asp Asn Leu Thr Asn Leu Asp Leu Ser Thr Pro Asp Gln Phe Asp Asn
245 250 255

816
Arg Tyr Tyr Ser Asn Leu Leu Gln Leu Asn Gly Leu Leu Gln Ser Asp
260

CAA GAA CTT TTC TCC ACT CCT GGT GCT GAT ACC ATT CCC ATT GTC AAT 864
Gln Glu Leu Phe Ser Thr Pro Gly Ala Asp Thr Ile Pro Ile Val Asn
275 280 285

912
Ser Phe Ser Ser Asn Gln Asn Thr Phe Phe Ser Asn Phe Arg Val Ser
ATG ATA AAA ATG GGT AAT ATT GGA GTG CTG ACT GGG GAT GAA GGA GAA 960
Met Ile Lys Met Gly Asn Ile Gly Val Leu Thr Gly Asp Glu Gly Glu
305 310 315 320

ATT CGC TTG CAA TGT AAT TTT GTG AAT GGA GAC TCG TTT GGA TTA GCT 1008
Ile Arg Leu Gln Cys Asn Phe Val Asn Gly Asp Ser Phe Gly Leu Ala
325 330 335

AGT GTG GCG TCC AAA GAT GCT AAA CAA AAG CTT GTT GCT CAA TCT AAA 1056
Ser Val Ala Ser Lys Asp Ala Lys Gln Lys Leu Val Ala Gln Ser Lys
340 345 350

TAAACCAATA ATTAATGGGG ATGTGCATGC TAGCTAGCAT GTAAAGGCAA ATTAGGTTGT 1116
AAACCTCTTT GCTAGCTATA TTGAAATAAA CCAAAGGAGT AGTGTGCATG TCAATTCGAT 1176
TTTGCCATGT ACCTCTTGGA ATATTATGTA ATAATTATTT GAATCTCTTT AAGGTACTTA 1236
ATTAATCA 1244
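
As a cross-check on the coding region of SEQ ID NO: 1, the sketch below translates one codon row with the standard genetic code. It is a minimal illustration and not part of the filed listing: `CODON_TABLE` and `translate` are hypothetical helper names, and applying the standard nuclear genetic code is an assumption. The input is the last coding row of the listing (nucleotides 1009 to 1056), optionally followed by the first three nucleotides of the untranslated region.

```python
from itertools import product

# Standard genetic code in TCAG order (assumed to apply to this sequence).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon; whitespace is ignored, '*' marks a stop."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Last coding row of SEQ ID NO: 1 as printed in the listing.
last_coding_row = "AGT GTG GCG TCC AAA GAT GCT AAA CAA AAG CTT GTT GCT CAA TCT AAA"

print(translate(last_coding_row))            # one-letter form of the row
print(translate(last_coding_row + " TAA"))   # trailing '*' is the stop codon
print(1056 // 3)                             # codons spanned by CDS 1..1056
```

The one-letter output can be compared against the three-letter row printed under the same nucleotides (Ser Val Ala Ser Lys Asp Ala Lys Gln Lys Leu Val Ala Gln Ser Lys).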


(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4700 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 




(ix) FEATURE: 

(A) NAME/KEY: promoter 

(B) LOCATION: 1..1532

(ix) FEATURE: 

(A) NAME/KEY: sig_peptide 

(B) LOCATION: 1533..1609

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 1533..1751

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 2383..2574

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 3605..3769

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 4033..4516

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 1752..2382

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 2575..3604

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 3770..4032

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1533..1751

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 2383..2574

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3605..3769

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 4033..4515
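
The CDS coordinates in the feature table above can be sanity-checked with a little arithmetic. The sketch below is an illustration only, not part of the filed listing: `CDS_SEGMENTS`, `segment_length` and `gaps_between` are hypothetical names, and the coordinates are read as 1-based and inclusive, which is an assumption about the listing's convention. It sums the four CDS segments of SEQ ID NO: 2 and derives the non-coding gaps between consecutive segments.

```python
# CDS segments of SEQ ID NO: 2 as given in the feature table
# (assumed 1-based, inclusive coordinates).
CDS_SEGMENTS = [(1533, 1751), (2383, 2574), (3605, 3769), (4033, 4515)]


def segment_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def gaps_between(segments):
    """Intervals lying strictly between consecutive segments."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    ]


cds_total = sum(segment_length(start, end) for start, end in CDS_SEGMENTS)
print(cds_total)                   # total coding nucleotides across the segments
print(cds_total - 1056)            # difference from the cDNA CDS length (1..1056)
print(gaps_between(CDS_SEGMENTS))  # non-coding intervals between the segments
```

The segments total 1059 nucleotides, three more than the 1056-nucleotide CDS of SEQ ID NO: 1. That difference would be consistent with the genomic CDS range including the TAA stop codon, although this reading is an inference rather than something the listing states.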


50 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
TAGATAAAAA AATGGGATAT AATTTTTCTC AGATGTTGTT TATACTGTTT TTTTAATCAG 
AATTAAAATT CCTCTTTAAT TATCGACATA ATTTTTTTTG GTGAATATTA TCGACATAAT 120 
TATTTAATAC AAATTTTTAT TGTACATAGA AGTGATACTT CAATTTTAAT ATTGGAGAAC 180 
AGTACGAAAA CATAAAAAAA CTGTTATTAG AAGAAAAAAA TATATGGAAA AGGTTAGCTA 240
CATATATTAG CTAAATTAGT TGTTCTAATT GGCTATATAA ACCCTATTGT ACTCTTTGTA 300
ATCTCACCTT TTTCATTTAA ATACATTTCT ACTTTTTAAG TTCTATATTT TCTCTCAATT 360
TTCTTCGATA AACCATGAAA TTTAACATGG TATATCAGCG ATACCACCCA CTTTGAAAGC 420
CATGTATGGC TAGTATGGGC AGCCAAAATT TGCCCTGGTT CAAGCAAAGC AAGTGTTTAT 480
ATAGATGTGA CTTTTGTTGA GGAACTCATG CCAATGGTAC TGATTGTGAA ACTGAGAAAA 540
CTAATTTGGA GAATTTGAAT TATGATCATT AAATACTCCT CTCCTGACTA CCTTCGTCCC 600
TCAAATTTGT ACCATCATTA TTTCCCAAAA ATTTGATTAC AATGCACTAA TTAATGAATG 660
TTTCTTACAT TATCATATTA TCATATCTGA CATTTTGTTT TTACTTTTTA TAATAATTAT 720
TTTAAAAAGT CATACATGCA AATAATTTTT TAATAGTTTA CAGTTAAATT TTTACAGTAA 780
AAATGCATGA AAATTAAACT TTATTTTTCC AAGTCATCAT TTAGTCAAAT CCCAAAACAA 840
TGATTATTTT TTGCAAATGA ATGTTTATTG AACATTTAAA TGTAGCCTAA TTAATTCTGG 900
TTATGGTGTC AATGTTCCAA AACCTAATGC AAGATCTTAG CAAGTACATA CATAGATCTA 960
ATTTTAAACT TATCTTTACG CAAGAGATAT AAAGATTATA CATCTAGTTT TAAACATTAA 1020
CTTTTGTTTT TGTGTTAAAA AACAGTAACA TTTTCTTAAT TTTGTAGAGT GACGTGCTCC 1080
AACCATATTA ACGAAGATTT TAATTGGTAT TCAAGTTCAT GAACTTAGTA AATAAGTTTT 1140
OOTCXTO^OT X..™ -™ ^

^XTOCTTO .OT3C.CC« CC.™ 0.0~ ^T.T TC~T ...0 
„CC. =CC™C«X — ™ "CCCC^T ^.C^C^ 

™™ ..cxcT^T cr^c^^ 

T.T.C™T .T TTTTCT™ ^O.™ TXW^ 

rrr'-CAATTC ATCTCTATAA ATAAAAGGAT 
CATTTGCGAT ACCGTGAGTT ACAAGAAATC CGC.GAATTC 

1500

CTATATGAGA GGTAAAATCA TATTAACTCA AA ATG GGT TCC ATG CGT CTA TTA 1553
Met Gly Ser Met Arg Leu Leu


355 


375 

370 

360 


3S0 

3,B 


4.0 


1601 


1649 


1697 


1745 
GTT CAA GTACGTACTT TTTTTTTTCC TTCCAAAATG CCCTGCATAT TTAACAAGAT 1801
Val Gln
425


TGCTTTGTTC ACCTAGAAAA ATGTGTTTTT TTCAACGATC TTACGTACGT TTGTTTGGTT 1861 

XGAAAAATAA ATCAGAAAGA GATCAAGAAA ATAGCTAGAA AOAAAGCAAC GTTTTTTTAA 1921

^GGTATTTA GTGTGAGAAA AATATTAAAA CTGAAGAGAA AGAAATTAAA TAAGCTTTTC 1981 

TTGAATGATA TTTACATGTC TTATTAACTT .^GTCACCT TTT.TCTXTA AGTTGTGCTT 2041 

GAAGAAAAAA GATGTCTTTC AGTTTAGTTT TGATTAATGC TAATTATATT TTTAATTAAT 2101 

TAATTAATAC TATATATCTA TTTACCATAT TAATTATTAC TATATTTCAT GATGACAACA 2161 

GACAAGTATT CTAAAGAGGT ATCGGTAGAT GATTAATTTT TTTATAAAAA AATCTTTTGC 2221 

GTGTATAGAT ATTCTTTTAT AATTGGTGCA GAAACTTGTA ATGCTAATTG CAATTAATCT 2281 

TACATTGATT AACTAATAGC TATAATCAAT ATTTAGGTTA GGTATAGGAG ACAAATCAAG 2341 

XGATCTG.AAC AAATTAAGTT GTTATATTTG CATTGTGACA G GGT TGT GAT GGA 

Gly Cys Asp Gly 


TCA GTT TTG CTG AAC AAC ACT GAT ACA ATA GAA AGC GAG CAA GAT GCA 2442

2490
Leu Pro Asn Ile Asn Ser Ile Arg Gly Leu Asp Val Val Asn Asp Ile
25 30 35

AAG ACA GCG GTG GAA AAT AGT TGT CCA GAC ACA GTT TCT TGT GCT GAT 2538
Lys Thr Ala Val Glu Asn Ser Cys Pro Asp Thr Val Ser Cys Ala Asp
40 45 50

ATT CTT GCT ATT GCA GCT GAA ATA GCT TCT GTT CTG GTAATTAATA 2584
Ile Leu Ala Ile Ala Ala Glu Ile Ala Ser Val Leu
55 60

ACTCCTAATT AATTCCCAAC CATTAAAAAG TTGCATGATT GGATTCAAAA TTCTATGGTA 2644
AATAAAAAAA ATTTATCTAA TTTAATTTAT TATTAAAACT ATTTTTAAAA TTCAATCCTA 2764
ACTCTTTTTT AATCGGAGCA TGTAAGCTGG CACCCACCGT ATATCGTTGG AAGATGCTAT 2824
AAAACCATTT AATTAATGGA TGGAATCAGT CAAAACATTT AATTCAAAAT ACTCTTAATT 2884
GTGATTAGTA ATCATGTTCG GGCAAGTTAC GTTGTGTATA ATTAATTTGA CTTAATCAGA 2944
TAAAAAAACA AATGGACGCA AGCCGGTTGG TATAGATATC ACTGGCCTGT AGAATATGTG 3004
GTTTTTCACG TTTAAATAAA AGCTAGCTAC TATATTATAT TTAGTCTTTT TTTTTCTTAA 3064
ACCCATTTAA CGTGATTTAT TGACTGTGAA ACATGTTTCC ACACACAGGC TTAGAAACTC 3124
CTCGCAACTA ACATCTCCAA AATTTGACTA TTTATTTATG AAGATAATTC ATCTATGATG 3184
TTCAACTCTA TTATATATAT GTATCATCGC AGTATTAAGA ATTATAATAG TCAAATATAG 3244
AAGTATATCG GGTAAATGTA GTTGCATGTG CGACCTGTTT CGTGTAAAAT GCTTATTCTA 3304
TATAGCTTTT TTTATTGGAA AATAACGATG AACTAAAAAC GAAAGGGTAT CATATAGTTT 3364
GACTTTTATG TTAGAGAGAG ACATCTTAAT TTGGTCATAT GTTAAATAAT TAATTACAAT 3424
GCATACACAA ATATTTATGC CATATCTAAA AAATGATAAA ATATCATAGG TATACTCAAC 3484
TATATGATAT CCCCATAACA GAAATTGTAC TTTTCTTCAG GCAATGAACT TAACATTTCT 3544
GTTTGCTAAA AACAAACATC CACTTAAAGT GGTTCAACAT ATTTATGTAA TAATTTACAG 3604

GGA GGA GGT CCA GGA TGG CCA GTT CCA TTA GGA AGA AGG GAC AGC TTA 3652
Gly Gly Gly Pro Gly Trp Pro Val Pro Leu Gly Arg Arg Asp Ser Leu
5 10 15

ACA GCA AAC CGA ACC CTT GCA AAT CAA AAC CTT CCA GCA CCT TTC TTC 3700
Thr Ala Asn Arg Thr Leu Ala Asn Gln Asn Leu Pro Ala Pro Phe Phe
20 25 30

AAC CTC ACT CAA CTT AAA GCT TCC TTT GCT GTT CAA GGT CTC AAC ACC 3748
Asn Leu Thr Gln Leu Lys Ala Ser Phe Ala Val Gln Gly Leu Asn Thr
35 40 45

CTT GAT TTA GTT ACA CTC TCA GGTATACATA ATCAATTTTT TATTTGCTAT 3799
Leu Asp Leu Val Thr Leu Ser
50 55


TAGCTAGCAA TAAAAAGTCT CTGATACAGA CATATTTAGA TAAATTAATT TCTCCATAAA 3859
C.TTTATAAT AAAATTATCA ATTTATGTAC TTAAAAATTA TGGATTGAAG CTCTTTTCAT 3919
CCAACTTTTA CTAAAGTTAA GGTGCATATA ATATAAAATA AACTATCTCT TGTTTCTTAT 3979


^OATTG AAGATAAGTT AAAGTCTACT TATAAATCAT TAATATATGT ATA GGT 4035
Gly

4083
5 10 15

TAC AAC TTC AGC AAC ACT GGA AAC CCT GAT CCA ACT CTG AAC ACA ACA 4131
Tyr Asn Phe Ser Asn Thr Gly Asn Pro Asp Pro Thr Leu Asn Thr Thr
20

4179
Tyr Leu Glu Val Leu Arg Ala Arg Cys Pro Gln Asn Ala Thr Gly Asp
35 40 45

AAC CTC ACC AAT TTG GAC CTG AGC ACA CCT GAT CAA TTT GAC AAC AGA 4227

TAC TAC TCC AAT CTT CTG CAG CTC AAT GGC TTA CTT CAG AGT GAC CAA 4275
Tyr Tyr Ser Asn Leu Leu Gln Leu Asn Gly Leu Leu Gln Ser Asp Gln
70 75 80

4323
85

4371
Phe Ser Ser Asn Gln Asn Thr Phe Phe Ser Asn Phe Arg Val Ser Met
100

ATA AAA ATG GGT AAT ATT GGA GTG CTG ACT GGG GAT GAA GGA GAA ATT 4419
115

CGC TTG CAA TGT AAT TTT GTG AAT GGA GAC TCG TTT GGA TTA GCT AGT 4467
Arg Leu Gln Cys Asn Phe Val Asn Gly Asp Ser Phe Gly Leu Ala Ser
130 140

GTG GCG TCC AAA GAT GCT AAA CAA AAG CTT GTT GCT CAA TCT AAA TAA 4515
Val Ala Ser Lys Asp Ala Lys Gln Lys Leu Val Ala Gln Ser Lys *
150 155 160


.CCAAT^TT W=GOC.Ta T.CAT.CTAO CT.CCATO™ .AO=C««TT AGGTTG.AAA 4575
CCTCTTTCCT .^CTAT.T.C ^.^CC AACGA^GT .TGCATGTCA ATTCGATTTT 4635
GCCATGTACC TCTTCGAATA TTATGTAATA ATTATTTGAA TCTCTTTAAG GTACTTAATT 4695
AATCA 4700


